Data stream as disjoint intervals

Time: O(LogN); Space: O(N); hard

Given a data stream input of non-negative integers a1, a2, …, an, …, summarize the numbers seen so far as a list of disjoint intervals.

For example, suppose the integers from the data stream are 1, 3, 7, 2, 6, …, then the summary will be:

[1, 1]
[1, 1], [3, 3]
[1, 1], [3, 3], [7, 7]
[1, 3], [7, 7]
[1, 3], [6, 7]

Example 1:

Input/Output:

s = SummaryRanges()
s.addNum(1)
s.getIntervals()   # [[1,1]]
s.addNum(3)
s.getIntervals()   # [[1,1],[3,3]]
s.addNum(7)
s.getIntervals()   # [[1,1],[3,3],[7,7]]
s.addNum(2)
s.getIntervals()   # [[1,3],[7,7]]
s.addNum(6)
s.getIntervals()   # [[1,3],[6,7]]

Explanation

  • add(1)->get([[1, 1]])->

  • add(3)->get([[1, 1], [3, 3]])->

  • add(7)->get([[1, 1], [3, 3], [7, 7]])->

  • add(2)-merge(1,2,3)->get([[1, 3], [7, 7]])->

  • add(6)->merge(6,7)->get([[1, 3], [6, 7]])

Example 2:

Input/Output:

s = SummaryRanges()
s.addNum(4)
s.getIntervals()     # [4,4]
s.addNum(3)
s.getIntervals()     # [3,4]

Explanation

  • add(4)->get([[4, 4]])->

  • add(3)->merge(3,4)->get([[3, 4]])

Follow up:

  1. What if there are lots of merges and the number of disjoint intervals are small compared to the data stream’s size?

[1]:
class Interval(object):
    def __init__(self, s=0, e=0):
        self.start = s
        self.end = e
[6]:
class SummaryRanges(object):

    def __init__(self):
        self.__intervals = []

    def addNum(self, val):
        """
        :type val: int
        :rtype: None
        """
        def upper_bound(nums, target):
            left, right = 0, len(nums) - 1
            while left <= right:
                mid = left + (right - left) // 2
                if nums[mid].start > target:
                    right = mid - 1
                else:
                    left = mid + 1
            return left

        i = upper_bound(self.__intervals, val)
        start, end = val, val
        if i != 0 and self.__intervals[i-1].end + 1 >= val:
            i -= 1
        while i != len(self.__intervals) and \
              end + 1 >= self.__intervals[i].start:

            start = min(start, self.__intervals[i].start)
            end = max(end, self.__intervals[i].end)
            del self.__intervals[i]

        self.__intervals.insert(i, Interval(start, end))

    def getIntervals(self):
        """
        :rtype: List[Interval]
        """
        return self.__intervals
[7]:
s = SummaryRanges()
s.addNum(1)
r = s.getIntervals()
assert [r[0].start,r[0].end] == [1, 1]
s.addNum(3)
r = s.getIntervals()
assert [[r[0].start,r[0].end], [r[1].start,r[1].end]] == [[1, 1],[3,3]]
s.addNum(7)
r = s.getIntervals()
assert [[r[0].start,r[0].end], [r[1].start,r[1].end], [r[2].start,r[2].end]] == [[1,1], [3,3], [7,7]]
s.addNum(2)
r = s.getIntervals()
assert [[r[0].start,r[0].end], [r[1].start,r[1].end]] == [[1, 3], [7,7]]
s.addNum(6)
r = s.getIntervals()
assert [[r[0].start,r[0].end], [r[1].start,r[1].end]] == [[1, 3], [6,7]]

s = SummaryRanges()
s.addNum(4)
r = s.getIntervals()
assert [r[0].start,r[0].end] == [4, 4]
s.addNum(3)
r = s.getIntervals()
assert [r[0].start,r[0].end] == [3, 4]